Discovering Correction Rules for Auto Editing

Authors

  • Anta Huang
  • Tsung-Ting Kuo
  • Ying-Chun Lai
  • Shou-De Lin
Abstract

This paper describes a framework that extracts effective correction rules from a sentence-aligned corpus and shows a practical application: auto-editing using the discovered rules. The framework uses the Levenshtein distance between aligned sentences to identify the key parts of the rules, then uses the editing corpus to filter, condense, and refine them. The rule candidates take the form A → B, where A stands for the erroneous pattern and B for the correct pattern. The framework is language independent, so it can be applied to other languages. An evaluation of the discovered rules shows that 67.2% of the top 1500 ranked rules are annotated as correct or mostly correct by experts. Based on the rules, we have developed an online auto-editing system for demonstration at http://ppt.cc/02yY.
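
The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization, a word-level Levenshtein alignment, and one kept word of context on each side of an edited region. The names edit_ops and candidate_rules are hypothetical, and the filtering, condensing, and ranking stages mentioned in the abstract are not covered.

```python
from typing import List, Tuple

def edit_ops(src: List[str], tgt: List[str]) -> List[Tuple[str, int, int]]:
    """Word-level Levenshtein alignment, returned as (tag, src_idx, tgt_idx) ops."""
    n, m = len(src), len(tgt)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete a source word
                             dist[i][j - 1] + 1,         # insert a target word
                             dist[i - 1][j - 1] + cost)  # keep or substitute
    # Walk back from the bottom-right cell to recover one optimal edit script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == tgt[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", i - 1, j - 1)); i -= 1; j -= 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("sub", i - 1, j - 1)); i -= 1; j -= 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", i - 1, j)); i -= 1
        else:
            ops.append(("ins", i, j - 1)); j -= 1
    ops.reverse()
    return ops

def candidate_rules(wrong: str, right: str) -> List[Tuple[str, str]]:
    """Turn each contiguous edited region into a candidate rule (A, B).

    Assumption for illustration: one kept word of context is attached on
    each side of the region, when such a word exists.
    """
    src, tgt = wrong.split(), right.split()
    ops = edit_ops(src, tgt)
    rules = []
    k = 0
    while k < len(ops):
        if ops[k][0] == "keep":
            k += 1
            continue
        start = k
        while k < len(ops) and ops[k][0] != "keep":
            k += 1
        region = ops[start:k]
        a = [src[i] for tag, i, _ in region if tag in ("sub", "del")]
        b = [tgt[j] for tag, _, j in region if tag in ("sub", "ins")]
        # Neighbouring ops are "keep" by construction, so the word is shared.
        left = [src[ops[start - 1][1]]] if start > 0 else []
        right_ctx = [src[ops[k][1]]] if k < len(ops) else []
        rules.append((" ".join(left + a + right_ctx),
                      " ".join(left + b + right_ctx)))
    return rules

if __name__ == "__main__":
    for a, b in candidate_rules("I look forward to see you",
                                "I look forward to seeing you"):
        print(f"{a} -> {b}")
```

Run on the sentence pair above, the sketch prints the single candidate "to see you -> to seeing you". In a full pipeline, candidates like this would still need to be filtered and ranked against the editing corpus, as the abstract describes.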


Similar resources

Identifying Correction Rules for Auto Editing

This paper describes a framework to extract the effective correction rules from the sentence-aligned corpus and show a practical application: auto-editing using the found rules. The framework exploits the methodology of finding Levenshtein distance between sentences to identify the key parts of the rules and then use the editing corpus to filter, condense and refine the rules. We produce the ru...


MT9V126 Data Sheet

Features • Low-power CMOS image sensor with integrated image flow processor (IFP) and video encoder • 1/4-inch optical format, VGA resolution (640H x 480V) • ±2.5% additional columns and rows to compensate for lens alignment tolerances • Integrated lens distortion correction • Overlay generator for dynamic bitmap overlay • Integrated video encoder for NTSC/PAL with overlay capability and 10-bit...


Discovering Editing Rules For Data Cleaning

Dirty data continues to be an important issue for companies. The database community pays a particular attention to this subject. A variety of integrity constraints like Conditional Functional Dependencies (CFD) have been studied for data cleaning. Data repair methods based on these constraints are strong to detect inconsistencies but are limited on how to correct data, worse they can even intro...


Object-Oriented Identifier Renaming Correction in Three-Way Merge

There are two traditional concurrency models among the source code management (SCM) systems: lock and merge models. The lock model prevents the concurrent modification on the same files, but the merge model allows the parallel editing, and performs a merge to reconcile the changes. A three-way merge engine is a usual part of SCM systems, some of them attempt to auto-merge the files, but sometim...


Critique of Manuscript-Correction/ The Role of Editors in Presenting the Author: A review of Toghray Mashhadi's biography in his newly published Book of Essays, Fatima Mehri

The Role of Editors in Presenting the Author: A Review of Toghray Mashhadi's Biography in His Newly Published Book of Essays. Fatemeh Mehri, Associate Professor of Persian Language and Literature, Shahid Beheshti University. Abstract: Researchers in the field of editing and correcting manuscripts consider the writing of introductions as part of the correction process. T...



Journal:
  • IJCLCLP

Volume 15, Issue

Pages -

Publication date: 2010